Overview

Dataset statistics

Number of variables: 24
Number of observations: 50000
Missing cells: 289
Missing cells (%): < 0.1%
Duplicate rows: 6615
Duplicate rows (%): 13.2%
Total size in memory: 9.0 MiB
Average record size in memory: 188.0 B
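
The percentages in this table follow directly from the raw counts. A minimal sketch of that arithmetic, using only figures from the table (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 24
n_observations = 50_000
missing_cells = 289
duplicate_rows = 6_615

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of missing cells: about 0.024%, which the report shows as "< 0.1%".
missing_pct = 100 * missing_cells / total_cells

# Share of duplicate rows: 13.23%, which the report rounds to 13.2%.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"missing cells:  {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```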

Variable types

CAT: 12
NUM: 10
BOOL: 2

Warnings

Dataset has 6615 (13.2%) duplicate rows [Duplicates]
country has a high cardinality: 154 distinct values [High cardinality]
arrival_date has a high cardinality: 793 distinct values [High cardinality]
previous_cancellations is highly skewed (γ1 = 28.90866083) [Skewed]
lead_time has 3915 (7.8%) zeros [Zeros]
stays_in_weekend_nights has 21640 (43.3%) zeros [Zeros]
stays_in_week_nights has 3818 (7.6%) zeros [Zeros]
previous_cancellations has 49619 (99.2%) zeros [Zeros]
previous_bookings_not_canceled has 47735 (95.5%) zeros [Zeros]
booking_changes has 39823 (79.6%) zeros [Zeros]
days_in_waiting_list has 49116 (98.2%) zeros [Zeros]
average_daily_rate has 1166 (2.3%) zeros [Zeros]
total_of_special_requests has 24493 (49.0%) zeros [Zeros]
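
The Zeros and Skewed warnings rest on two simple column statistics. The sketch below shows one way to compute them in plain Python. It uses the moment-based skewness g1 = m3 / m2^1.5, which is an assumption: the report's γ1 may include a small-sample correction, although at 50000 rows the difference would be negligible. The `toy` column is hypothetical and only mimics the shape of previous_cancellations.

```python
def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy column: mostly zeros with a few larger values.
toy = [0] * 95 + [1, 1, 2, 3, 13]

print(zero_share(toy))    # share of zeros in the toy column
print(skewness(toy) > 0)  # a long right tail gives positive skewness
```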

Reproduction

Analysis started: 2022-10-09 00:39:28.049113
Analysis finished: 2022-10-09 00:39:54.336119
Duration: 26.29 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
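
The reported duration is the difference between the two timestamps. A short check with the standard library, using the timestamps exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-10-09 00:39:28.049113")
finished = datetime.fromisoformat("2022-10-09 00:39:54.336119")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 26.29 seconds
```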

Variables

hotel
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value         Count  Frequency (%)
City_Hotel    30752  61.5%
Resort_Hotel  19248  38.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 12
Median length: 10
Mean length: 10.76992
Min length: 10
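
The length statistics for hotel can be reproduced from the two category labels and their counts. A minimal sketch (the dictionary is built by hand from the frequency figures above):

```python
# Category labels and counts for the hotel variable.
counts = {"City_Hotel": 30752, "Resort_Hotel": 19248}

total = sum(counts.values())

# Mean label length, weighted by how often each label occurs.
mean_length = sum(len(label) * n for label, n in counts.items()) / total

min_length = min(len(label) for label in counts)
max_length = max(len(label) for label in counts)

print(f"{mean_length:.5f}")  # 10.76992
print(min_length, max_length)
```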

lead_time
Real number (ℝ≥0)

ZEROS

Distinct: 414
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.09412
Minimum: 0
Maximum: 709
Zeros: 3915
Zeros (%): 7.8%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 8
median: 45
Q3: 125
95-th percentile: 269
Maximum: 709
Range: 709
Interquartile range (IQR): 117

Descriptive statistics

Standard deviation: 91.20136192
Coefficient of variation (CV): 1.13867737
Kurtosis: 2.220205704
Mean: 80.09412
Median Absolute Deviation (MAD): 42
Skewness: 1.514854429
Sum: 4004706
Variance: 8317.688415
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0                   3915   7.8%
1                   2046   4.1%
2                   1258   2.5%
3                   1142   2.3%
4                   1031   2.1%
5                   885    1.8%
6                   831    1.7%
7                   790    1.6%
8                   611    1.2%
11                  570    1.1%
Other values (404)  36921  73.8%

Value  Count  Frequency (%)
0      3915   7.8%
1      2046   4.1%
2      1258   2.5%
3      1142   2.3%
4      1031   2.1%

Value  Count  Frequency (%)
709    1      < 0.1%
542    12     < 0.1%
518    17     < 0.1%
504    10     < 0.1%
479    16     < 0.1%

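Several of the lead_time statistics are simple functions of one another. The sketch below recomputes the mean, interquartile range, coefficient of variation, and variance from the other reported values. It is a consistency check on the printed figures only, not a recomputation from the data.

```python
# Values copied from the lead_time statistics.
n_observations = 50_000
total = 4_004_706          # Sum
std = 91.20136192          # Standard deviation
q1, q3 = 8, 125            # Q1 and Q3

mean = total / n_observations  # arithmetic mean = Sum / count
iqr = q3 - q1                  # interquartile range = Q3 - Q1
cv = std / mean                # coefficient of variation = std / mean
variance = std ** 2            # variance = std squared

print(f"mean     = {mean:.5f}")
print(f"IQR      = {iqr}")
print(f"CV       = {cv:.8f}")
print(f"variance = {variance:.6f}")
```

Up to rounding of the printed standard deviation, the results agree with the reported Mean, IQR, CV, and Variance.
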
stays_in_weekend_nights
Real number (ℝ≥0)

ZEROS

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.92852
Minimum: 0
Maximum: 19
Zeros: 21640
Zeros (%): 43.3%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 2
Maximum: 19
Range: 19
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9962883425
Coefficient of variation (CV): 1.072985334
Kurtosis: 10.40739741
Mean: 0.92852
Median Absolute Deviation (MAD): 1
Skewness: 1.51207629
Sum: 46426
Variance: 0.9925904614
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=17)

Value             Count  Frequency (%)
0                 21640  43.3%
2                 13840  27.7%
1                 13031  26.1%
4                 826    1.7%
3                 564    1.1%
6                 41     0.1%
5                 22     < 0.1%
8                 18     < 0.1%
10                4      < 0.1%
7                 2      < 0.1%
Other values (7)  12     < 0.1%

Value  Count  Frequency (%)
0      21640  43.3%
1      13031  26.1%
2      13840  27.7%
3      564    1.1%
4      826    1.7%

Value  Count  Frequency (%)
19     1      < 0.1%
18     1      < 0.1%
16     2      < 0.1%
14     2      < 0.1%
13     2      < 0.1%

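The numeric variables are summarised with a "histogram with fixed size bins". As a rough illustration of that idea, the sketch below splits the range from minimum to maximum into equal-width bins and counts the values in each. The function name and the choice to put the maximum in the last bin are assumptions, not taken from the report's implementation.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values identical, so everything goes in one bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lands exactly on the right edge; keep it in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical toy data, one value per bin.
print(fixed_width_histogram([0, 1, 2, 3, 4], bins=5))
```
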
stays_in_week_nights
Real number (ℝ≥0)

ZEROS

Distinct: 31
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.46454
Minimum: 0
Maximum: 50
Zeros: 3818
Zeros (%): 7.6%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 50
Range: 50
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.936176016
Coefficient of variation (CV): 0.7856135489
Kurtosis: 31.42440699
Mean: 2.46454
Median Absolute Deviation (MAD): 1
Skewness: 2.987397579
Sum: 123227
Variance: 3.748777564
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Value              Count  Frequency (%)
1                  13619  27.2%
2                  12513  25.0%
3                  9161   18.3%
5                  4779   9.6%
4                  4020   8.0%
0                  3818   7.6%
6                  616    1.2%
10                 488    1.0%
7                  481    1.0%
8                  303    0.6%
Other values (21)  202    0.4%

Value  Count  Frequency (%)
0      3818   7.6%
1      13619  27.2%
2      12513  25.0%
3      9161   18.3%
4      4020   8.0%

Value  Count  Frequency (%)
50     1      < 0.1%
42     1      < 0.1%
41     1      < 0.1%
40     1      < 0.1%
35     1      < 0.1%

adults
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.83028
Minimum: 0
Maximum: 4
Zeros: 194
Zeros (%): 0.4%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5090778966
Coefficient of variation (CV): 0.2781420857
Kurtosis: 0.8729303839
Mean: 1.83028
Median Absolute Deviation (MAD): 0
Skewness: -0.3993167894
Sum: 91514
Variance: 0.2591603048
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)

Value  Count  Frequency (%)
2      36271  72.5%
1      10831  21.7%
3      2675   5.3%
0      194    0.4%
4      29     0.1%

Value  Count  Frequency (%)
0      194    0.4%
1      10831  21.7%
2      36271  72.5%
3      2675   5.3%
4      29     0.1%

Value  Count  Frequency (%)
4      29     0.1%
3      2675   5.3%
2      36271  72.5%
1      10831  21.7%
0      194    0.4%

children
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value     Count  Frequency (%)
none      45962  91.9%
children  4038   8.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 4
Mean length: 4.32304
Min length: 4

meal
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value      Count  Frequency (%)
BB         38316  76.6%
HB         6399   12.8%
SC         4494   9.0%
Undefined  580    1.2%
FB         211    0.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 9
Median length: 2
Mean length: 2.0812
Min length: 2

country
Categorical

HIGH CARDINALITY

Distinct: 154
Distinct (%): 0.3%
Missing: 289
Missing (%): 0.6%
Memory size: 390.6 KiB

Value               Count  Frequency (%)
PRT                 14046  28.1%
GBR                 6405   12.8%
FRA                 5627   11.3%
ESP                 4298   8.6%
DEU                 4047   8.1%
IRL                 1691   3.4%
ITA                 1607   3.2%
BEL                 1250   2.5%
NLD                 1123   2.2%
USA                 1059   2.1%
Other values (144)  8558   17.1%

Frequencies of value counts

Unique

Unique: 35
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.9866
Min length: 2

market_segment
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value          Count  Frequency (%)
Online_TA      23760  47.5%
Offline_TA/TO  10604  21.2%
Direct         7131   14.3%
Groups         5124   10.2%
Corporate      2832   5.7%
Complementary  427    0.9%
Aviation       122    0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 13
Median length: 9
Mean length: 9.14474
Min length: 6

distribution_channel
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value      Count  Frequency (%)
TA/TO      38349  76.7%
Direct     8083   16.2%
Corporate  3459   6.9%
GDS        108    0.2%
Undefined  1      < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 9
Median length: 5
Mean length: 5.43414
Min length: 3

is_repeated_guest
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value  Count  Frequency (%)
0      47840  95.7%
1      2160   4.3%

previous_cancellations
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.01674
Minimum: 0
Maximum: 13
Zeros: 49619
Zeros (%): 99.2%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 13
Range: 13
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.287856613
Coefficient of variation (CV): 17.19573554
Kurtosis: 1004.413708
Mean: 0.01674
Median Absolute Deviation (MAD): 0
Skewness: 28.90866083
Sum: 837
Variance: 0.08286142963
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value  Count  Frequency (%)
0      49619  99.2%
1      246    0.5%
2      52     0.1%
3      25     0.1%
11     20     < 0.1%
4      15     < 0.1%
5      13     < 0.1%
6      9      < 0.1%
13     1      < 0.1%

Value  Count  Frequency (%)
0      49619  99.2%
1      246    0.5%
2      52     0.1%
3      25     0.1%
4      15     < 0.1%

Value  Count  Frequency (%)
13     1      < 0.1%
11     20     < 0.1%
6      9      < 0.1%
5      13     < 0.1%
4      15     < 0.1%

previous_bookings_not_canceled
Real number (ℝ≥0)

ZEROS

Distinct: 57
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.20274
Minimum: 0
Maximum: 72
Zeros: 47735
Zeros (%): 95.5%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 72
Range: 72
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.803691093
Coefficient of variation (CV): 8.896572422
Kurtosis: 537.4106085
Mean: 0.20274
Median Absolute Deviation (MAD): 0
Skewness: 19.61142262
Sum: 10137
Variance: 3.253301558
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  47735  95.5%
1                  956    1.9%
2                  370    0.7%
3                  210    0.4%
4                  148    0.3%
5                  112    0.2%
6                  76     0.2%
7                  50     0.1%
9                  42     0.1%
8                  39     0.1%
Other values (47)  262    0.5%

Value  Count  Frequency (%)
0      47735  95.5%
1      956    1.9%
2      370    0.7%
3      210    0.4%
4      148    0.3%

Value  Count  Frequency (%)
72     1      < 0.1%
71     1      < 0.1%
69     1      < 0.1%
67     1      < 0.1%
65     1      < 0.1%

reserved_room_type
Categorical

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value  Count  Frequency (%)
A      34889  69.8%
D      8675   17.3%
E      3096   6.2%
F      1299   2.6%
G      899    1.8%
B      488    1.0%
C      417    0.8%
H      235    0.5%
L      2      < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

assigned_room_type
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value  Count  Frequency (%)
A      27357  54.7%
D      12577  25.2%
E      3924   7.8%
F      1839   3.7%
C      1305   2.6%
G      1185   2.4%
B      1079   2.2%
H      313    0.6%
I      239    0.5%
K      182    0.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

booking_changes
Real number (ℝ≥0)

ZEROS

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.29496
Minimum: 0
Maximum: 21
Zeros: 39823
Zeros (%): 79.6%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7400064531
Coefficient of variation (CV): 2.508836632
Kurtosis: 67.16072387
Mean: 0.29496
Median Absolute Deviation (MAD): 0
Skewness: 5.417086108
Sum: 14748
Variance: 0.5476095506
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)

Value             Count  Frequency (%)
0                 39823  79.6%
1                 7274   14.5%
2                 2018   4.0%
3                 523    1.0%
4                 212    0.4%
5                 71     0.1%
6                 31     0.1%
7                 13     < 0.1%
8                 11     < 0.1%
9                 6      < 0.1%
Other values (9)  18     < 0.1%

Value  Count  Frequency (%)
0      39823  79.6%
1      7274   14.5%
2      2018   4.0%
3      523    1.0%
4      212    0.4%

Value  Count  Frequency (%)
21     1      < 0.1%
18     1      < 0.1%
17     2      < 0.1%
16     1      < 0.1%
15     3      < 0.1%

deposit_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value       Count  Frequency (%)
No_Deposit  49839  99.7%
Refundable  92     0.2%
Non_Refund  69     0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

days_in_waiting_list
Real number (ℝ≥0)

ZEROS

Distinct: 92
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5704
Minimum: 0
Maximum: 379
Zeros: 49116
Zeros (%): 98.2%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 379
Range: 379
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 14.79030016
Coefficient of variation (CV): 9.418173817
Kurtosis: 200.4084546
Mean: 1.5704
Median Absolute Deviation (MAD): 0
Skewness: 12.8526275
Sum: 78520
Variance: 218.7529789
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  49116  98.2%
58                 114    0.2%
87                 47     0.1%
38                 34     0.1%
63                 34     0.1%
122                33     0.1%
223                26     0.1%
65                 26     0.1%
77                 25     0.1%
176                22     < 0.1%
Other values (82)  523    1.0%

Value  Count  Frequency (%)
0      49116  98.2%
1      7      < 0.1%
2      2      < 0.1%
4      9      < 0.1%
5      3      < 0.1%

Value  Count  Frequency (%)
379    4      < 0.1%
330    11     < 0.1%
259    7      < 0.1%
236    19     < 0.1%
224    2      < 0.1%

customer_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value            Count  Frequency (%)
Transient        35343  70.7%
Transient-Party  12430  24.9%
Contract         1864   3.7%
Group            363    0.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 15
Median length: 9
Mean length: 10.42528
Min length: 5

average_daily_rate
Real number (ℝ)

ZEROS

Distinct: 6173
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.9423424
Minimum: -6.38
Maximum: 510
Zeros: 1166
Zeros (%): 2.3%
Memory size: 390.6 KiB

Quantile statistics

Minimum: -6.38
5-th percentile: 35
Q1: 67.5
median: 92.5
Q3: 125
95-th percentile: 191
Maximum: 510
Range: 516.38
Interquartile range (IQR): 57.5

Descriptive statistics

Standard deviation: 49.03909248
Coefficient of variation (CV): 0.4906738355
Kurtosis: 2.07847995
Mean: 99.9423424
Median Absolute Deviation (MAD): 27.5
Skewness: 0.9545451394
Sum: 4997117.12
Variance: 2404.832591
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    1166   2.3%
65                   1104   2.2%
75                   1018   2.0%
85                   651    1.3%
90                   648    1.3%
95                   633    1.3%
80                   554    1.1%
48                   539    1.1%
115                  536    1.1%
60                   470    0.9%
Other values (6163)  42681  85.4%

Value  Count  Frequency (%)
-6.38  1      < 0.1%
0      1166   2.3%
1      6      < 0.1%
1.29   1      < 0.1%
1.56   2      < 0.1%

Value   Count  Frequency (%)
510     1      < 0.1%
508     1      < 0.1%
451.5   1      < 0.1%
426.25  1      < 0.1%
402     1      < 0.1%

required_car_parking_spaces
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value    Count  Frequency (%)
none     45019  90.0%
parking  4981   10.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 4
Mean length: 4.29886
Min length: 4

total_of_special_requests
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.71266
Minimum: 0
Maximum: 5
Zeros: 24493
Zeros (%): 49.0%
Memory size: 390.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.833804311
Coefficient of variation (CV): 1.16998893
Kurtosis: 0.9146507264
Mean: 0.71266
Median Absolute Deviation (MAD): 1
Skewness: 1.084228654
Sum: 35633
Variance: 0.695229629
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Value  Count  Frequency (%)
0      24493  49.0%
1      17234  34.5%
2      6679   13.4%
3      1358   2.7%
4      213    0.4%
5      23     < 0.1%

Value  Count  Frequency (%)
0      24493  49.0%
1      17234  34.5%
2      6679   13.4%
3      1358   2.7%
4      213    0.4%

Value  Count  Frequency (%)
5      23     < 0.1%
4      213    0.4%
3      1358   2.7%
2      6679   13.4%
1      17234  34.5%

arrival_date
Categorical

HIGH CARDINALITY

Distinct: 793
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 390.6 KiB

Value               Count  Frequency (%)
2015-12-05          173    0.3%
2016-06-24          148    0.3%
2016-05-26          143    0.3%
2016-06-06          135    0.3%
2017-02-25          133    0.3%
2015-10-02          128    0.3%
2017-05-25          124    0.2%
2015-10-15          122    0.2%
2015-10-19          120    0.2%
2016-03-24          119    0.2%
Other values (783)  48655  97.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

dummy_children
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 195.3 KiB

Value  Count  Frequency (%)
0      45962  91.9%
1      4038   8.1%

Interactions

[Figures: 100 interaction plots (Matplotlib SVG); images not reproduced in this text version]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
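
The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy data are illustrative assumptions; the report itself presumably relies on a library routine.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Hypothetical toy data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
# Shifting and rescaling either variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([2 * a + 5 for a in x], [0.5 * b - 1 for b in y])
print(r, r_rescaled)
```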

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
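
A minimal sketch of this recipe: rank both variables, then apply the Pearson formula to the ranks. Giving tied values the average of their ranks is an assumption (it is the usual convention), and all names and data are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data the rank correlation reaches its maximum while Pearson's r stays below 1, which is the difference the paragraph above describes.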

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
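
The pair-counting definition translates directly into code. The sketch below implements exactly this simple form (often called tau-a), in which tied pairs count as neither concordant nor discordant. Library implementations commonly default to a tie-adjusted variant, so results on data with many ties may differ. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Same sign of both differences: the pair is ordered the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: one swapped pair out of six.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```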

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
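
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table, following the correction as it is commonly stated: φ² is reduced by (k-1)(r-1)/(n-1), and the row and column counts r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). The function name and the toy tables are assumptions, and the table is assumed to have no empty rows or columns.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical toy tables: independent and perfectly associated variables.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```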

Missing values

[Figures: 3 missing-value plots (Matplotlib SVG); images not reproduced in this text version]

Sample

First rows

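(index) | hotel | lead_time | stays_in_weekend_nights | stays_in_week_nights | adults | children | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | days_in_waiting_list | customer_type | average_daily_rate | required_car_parking_spaces | total_of_special_requests | arrival_date | dummy_children
0 | City_Hotel | 217 | 1 | 3 | 2 | none | BB | DEU | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 80.75 | none | 1 | 2016-09-01 | 0
1 | City_Hotel | 2 | 0 | 1 | 2 | none | BB | PRT | Direct | Direct | 0 | 0 | 0 | D | K | 0 | No_Deposit | 0 | Transient | 170.00 | none | 3 | 2017-08-25 | 0
2 | Resort_Hotel | 95 | 2 | 5 | 2 | none | BB | GBR | Online_TA | TA/TO | 0 | 0 | 0 | A | A | 2 | No_Deposit | 0 | Transient | 8.00 | none | 2 | 2016-11-19 | 0
3 | Resort_Hotel | 143 | 2 | 6 | 2 | none | HB | ROU | Online_TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 81.00 | none | 1 | 2016-04-26 | 0
4 | Resort_Hotel | 136 | 1 | 4 | 2 | none | HB | PRT | Direct | Direct | 0 | 0 | 0 | F | F | 0 | No_Deposit | 0 | Transient | 157.60 | none | 4 | 2016-12-28 | 0
5 | City_Hotel | 67 | 2 | 2 | 2 | none | SC | GBR | Online_TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 49.09 | none | 1 | 2016-03-13 | 0
6 | Resort_Hotel | 47 | 0 | 2 | 2 | children | BB | ESP | Direct | Direct | 0 | 0 | 0 | C | C | 0 | No_Deposit | 0 | Transient | 289.00 | none | 1 | 2017-08-23 | 1
7 | City_Hotel | 56 | 0 | 3 | 0 | children | BB | ESP | Online_TA | TA/TO | 0 | 0 | 0 | B | A | 0 | No_Deposit | 0 | Transient | 82.44 | none | 1 | 2016-12-08 | 1
8 | City_Hotel | 80 | 0 | 4 | 2 | none | BB | FRA | Online_TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No_Deposit | 0 | Transient | 135.00 | none | 1 | 2017-05-02 | 0
9 | City_Hotel | 6 | 2 | 2 | 2 | children | BB | FRA | Online_TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 180.00 | none | 1 | 2016-08-07 | 1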

Last rows

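(index) | hotel | lead_time | stays_in_weekend_nights | stays_in_week_nights | adults | children | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | days_in_waiting_list | customer_type | average_daily_rate | required_car_parking_spaces | total_of_special_requests | arrival_date | dummy_children
49990 | Resort_Hotel | 283 | 2 | 8 | 2 | none | BB | GBR | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Contract | 72.20 | none | 1 | 2017-06-29 | 0
49991 | Resort_Hotel | 197 | 2 | 8 | 2 | none | Undefined | GBR | Offline_TA/TO | TA/TO | 0 | 0 | 0 | D | D | 1 | No_Deposit | 0 | Transient | 114.90 | none | 0 | 2016-06-01 | 0
49992 | City_Hotel | 414 | 0 | 2 | 2 | none | HB | DEU | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 122.40 | none | 1 | 2017-07-13 | 0
49993 | City_Hotel | 225 | 2 | 4 | 2 | none | BB | BRA | Online_TA | TA/TO | 0 | 0 | 1 | A | A | 0 | No_Deposit | 0 | Group | 70.03 | none | 1 | 2016-10-20 | 0
49994 | City_Hotel | 73 | 0 | 2 | 2 | none | SC | FRA | Online_TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 79.20 | none | 1 | 2017-01-27 | 0
49995 | Resort_Hotel | 172 | 0 | 2 | 2 | children | BB | PRT | Direct | Direct | 0 | 0 | 0 | A | A | 1 | No_Deposit | 0 | Transient | 73.39 | none | 1 | 2016-10-07 | 1
49996 | Resort_Hotel | 48 | 0 | 4 | 2 | none | FB | PRT | Direct | Direct | 0 | 0 | 0 | A | B | 2 | No_Deposit | 0 | Transient | 158.00 | none | 0 | 2015-09-01 | 0
49997 | City_Hotel | 155 | 0 | 4 | 2 | none | BB | DEU | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 82.50 | none | 1 | 2017-07-26 | 0
49998 | Resort_Hotel | 140 | 2 | 5 | 2 | none | HB | GBR | Direct | Direct | 0 | 0 | 0 | G | G | 0 | No_Deposit | 0 | Transient | 143.00 | none | 0 | 2016-04-28 | 0
49999 | City_Hotel | 12 | 2 | 1 | 2 | none | BB | DEU | Online_TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 171.33 | none | 1 | 2016-09-18 | 0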

Duplicate rows

Most frequent

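(index) | hotel | lead_time | stays_in_weekend_nights | stays_in_week_nights | adults | children | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | days_in_waiting_list | customer_type | average_daily_rate | required_car_parking_spaces | total_of_special_requests | arrival_date | dummy_children | count
1413 | City_Hotel | 134 | 0 | 1 | 1 | none | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 75.00 | none | 0 | 2017-02-25 | 0 | 36
2012 | City_Hotel | 377 | 0 | 2 | 2 | none | HB | DEU | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 115.00 | none | 1 | 2016-10-14 | 0 | 35
1980 | City_Hotel | 320 | 0 | 2 | 2 | none | HB | DEU | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 115.00 | none | 1 | 2016-08-18 | 0 | 34
747 | City_Hotel | 48 | 0 | 2 | 1 | none | BB | ESP | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 65.00 | none | 1 | 2017-02-22 | 0 | 33
1759 | City_Hotel | 213 | 1 | 3 | 1 | none | HB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 1 | No_Deposit | 0 | Transient-Party | 104.00 | none | 0 | 2017-08-28 | 0 | 33
1872 | City_Hotel | 257 | 0 | 2 | 2 | none | HB | PRT | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient | 101.50 | none | 0 | 2015-07-01 | 0 | 32
2023 | City_Hotel | 405 | 0 | 2 | 2 | none | HB | DEU | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 114.40 | none | 0 | 2017-07-04 | 0 | 32
940 | City_Hotel | 69 | 2 | 1 | 2 | none | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 58 | Transient-Party | 85.67 | none | 0 | 2015-10-25 | 0 | 30
1859 | City_Hotel | 256 | 0 | 2 | 2 | none | HB | DEU | Offline_TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 115.00 | none | 1 | 2016-06-15 | 0 | 30
2030 | City_Hotel | 414 | 0 | 2 | 2 | none | HB | DEU | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | No_Deposit | 0 | Transient-Party | 122.40 | none | 1 | 2017-07-13 | 0 | 29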